Down-line Transcription System Having Context Sensitive Searching Capability

ABSTRACT

A context sensitive searching front-end is disclosed for use in a deposition or trial proceeding wherein a computer aided transcription terminal provides real-time transcribed text down-line to attorney terminals. The terminals may thereafter use the transcribed text and any other text currently being displayed to formulate searches with little or no typing interaction required. Other text which may be used as a basis for searching includes communications from other attorney terminals, from artificial intelligence objection messages, and personal notes. Searching may be conducted on natural language or boolean front-ends which provide virtually instant feed-back as to the value of a search formulation before and after any “searching” actually occurs. Graphing of search results, including individual search word contribution, is provided for modification and selection of the documents to be reviewed. Library selection for searching is provided by analyzing the context from which the search originated, and from the actual words selected for searching. A database structure is also disclosed for providing backward referencing into the actual locations of search words without having to search the text files.

CROSS-REFERENCE TO RELATED APPLICATIONS Claiming Benefit Under 35 U.S.C. 120

This application is a continuation-in-part application of pending U.S. application Ser. No. 08/036,488, filed Mar. 24, 1993, by Bennett et al. (Attorney Docket No. P93-00).

INCORPORATION BY REFERENCE

The descriptive matter of the above-referred to pending U.S. application Ser. No. 08/036,488, filed Mar. 24, 1993, by Bennett et al. (Attorney Docket No. P93-00) is incorporated herein by reference in its entirety, and is made part of this application.

BACKGROUND OF THE INVENTION

This invention relates to a down-line transcription system used by attorneys for reviewing real-time transcription during a proceeding such as a trial or deposition; and, more particularly, it relates to a method and apparatus for providing context sensitive searching of a current transcript, other case evidence and case law which may be locally or remotely located.

As is well known, legal proceedings such as a deposition or trial involve the participation of, among others, an examining attorney who asks questions and a witness who must answer (“testify”) while under oath. These answers (“testimony”) are recorded by the court reporter, along with the associated questions and related conversation, in a digitally coded shorthand format using a stenographic recorder. Recent versions of stenographic recorders communicate the digitally coded shorthand to computer aided transcription (“CAT”) systems which attempt to automatically transcribe the coded shorthand into the exact text of words spoken. The CAT systems transmit the transcribed exact text along with occasionally interspersed coded shorthand (when automated transcription fails) down-line for real-time viewing by attorneys as well as by other participants involved.

As is also well known, during depositions and trial, attorneys often find it necessary to have immediate full-searching access to various information such as the current transcript, other case evidence and case law. However, because of the current requirements of each searching front-end associated with such information, immediate access often proves impossible, and attorneys are generally forced to forego their needed access. In many cases, this proves to be detrimental.

For example, instead of having immediate access to case law, attorneys are required to log-in to remote databases, and, after entering a series of preliminary library selections, are faced with formulating and typing a search which they hope will locate the desired case law. The formulated search must follow a syntax which is unique to the specific database being searched. In addition, the syntax usually includes a boolean format involving the use of parentheses, boolean “and” and “or” type logical word operators, and a plethora of other specific syntax commands used to limit a given search. The entire process is very time consuming. Furthermore, because the first search formulation often does not yield the desired results, the attorney must reformulate and manually re-enter the reformulated searches several times before locating the desired information.

Additionally, natural language searching front-ends have been added which, in a very complex fashion, attempt to ascertain actual search intent from an attorney's English language search request. Using the natural language front-end, after logging-in to the remote case law database and selecting the appropriate libraries, the attorney formulates a search in the form of a typical English language sentence or sentences. The search is processed in a remote main-frame computing environment, and the case law offering the best fit is delivered to the attorney. Local case law databases are also available but require the same preliminary library selection delays as with remote database searching. In addition, because of the computing power necessary, the natural language searching front-end does not run locally. Therefore, for locally stored case law, the problems associated with the use of a boolean front-end must still be faced.

To search case evidence, pleadings and other work product, the attorney faces similar delays. First, case evidence is usually stored remotely at the attorney's law offices. At best, this information is available via a dial-up communication link. To search the case evidence, the attorney must first selectively access the different databases, word processing, case management, and deposition review software packages which were used to create or store the specific case evidence, pleadings or other work product. Thereafter, the searching front-end associated with the chosen software package requires the attorney to identify the appropriate database and formulate a search using a syntax unique to the chosen software package.

With all of the different searching front-ends, preliminary searching setup requirements, various searching front-end differences and requirements for formulating and typing in a search, searching is generally a time consuming endeavor requiring a great deal of interaction between attorneys, support staff, and search databases. Compounding the problem, if the attorney decides to search for the same information across many databases, individual searches are required for each such database. As a result, when time is of the essence, the attorney usually has no choice but to ignore the impractical possibility of conducting a search. Unfortunately, time is almost always of the essence in the trial or deposition environment, where searching could prove to be of ultimate value.

Moreover, currently available searching front-ends do not provide an attorney with sufficient information about the database being searched to appropriately formulate or modify a search. Boolean type searching front-ends often yield literally hundreds of hits, yet such searching front-ends provide the attorney no indication as to how to appropriately alter the search or how to provide a successful search in the first place. Similarly, current natural language front-ends provide no indication of: 1) how effective a search formulation may turn out to be; 2) the computed significance weighting chosen for a given word; or 3) how to change a search to produce better results. As a result, not only do searches require multiple passes, but the attorney is also forced to review documents which have very little chance of yielding the desired search result.

Currently facing the foregoing problems are hundreds of thousands of attorneys in the United States alone. Hence, it would be highly desirable to solve the foregoing variety of problems enumerated in conducting legal proceedings such as a deposition or trial by providing a common searching front-end which provides seasonable response time with minimal attorney interaction.

It is therefore an object of the present invention to provide a method and apparatus which aids the attorney by permitting a common searching front-end which does not require formulation of a search during a trial or deposition.

It is an object of the present invention to provide a method and apparatus which provides searching capability based on contextual text received for other purposes.

It is an additional object of the present invention to provide a method and apparatus which provides for simple search formulations based on manipulation of previously available text received for alternate purposes.

It is an additional object of the present invention to provide a method and apparatus which provides real-time database feed-back regarding the characteristics of a database in view of a potential or current search formulation.

It is another object of the present invention to provide a method and apparatus which aids the attorney by providing a common searching front-end which detects the context of a search without requiring interactive log-in, library selection, or other preliminary searching requirements.

SUMMARY OF THE INVENTION

These and other objects of the present invention are achieved in an attorney terminal for performing database searching. The terminal has a display which can be controlled to display alphabetic and numeric text from a variety of sources, most of which is displayed for non-searching reasons. Using a searching front-end, the terminal selectively responds by searching based on the displayed non-searching text.

Other objects are also achieved in an attorney terminal wherein the searching front-end evaluates the context of the search to anticipate the appropriate database searching destination. This may include cues from the setup information, the current state of the deposing environment, and the actual search terms selected.

Other objects are also achieved in an attorney terminal wherein the searching front-end selectively responds to classify the significance of alphabetic and numeric text provided for non-searching purposes which has been selected for searching. In addition, only the alphabetic and numeric text classified as significant is considered for the searching.

Other objects are also achieved in an attorney terminal having boolean and natural language searching front-ends which provide virtually immediate feedback as to the value of the search. In some cases this may be graphical, in others, merely numerical. The graphical feedback also provides for immediate feedback as to the contribution of individual word(s) in the search.

Other objects and further aspects of the present invention will become apparent in view of the following detailed description and claims with reference to the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram which illustrates an overall system configuration of the present invention as used in a legal proceeding such as a deposition or trial.

FIG. 2 is a detailed perspective view of the display of a down-line terminal according to the present invention which illustrates the use of context sensitive searching based on a mischaracterization objection.

FIG. 3 a is a detailed perspective view illustrating a further display of the down-line terminal in response to the mischaracterization search selection detailed in FIG. 2.

FIG. 3 b provides an illustration of the operation of the interactive natural language search of the present invention which includes graphical database searching information aiding in the formulation of a search.

FIG. 4 a is a detailed perspective view of the display of the down-line terminal which illustrates context sensitive searching using the boolean language searching front-end of the present invention.

FIG. 4 b is a detailed perspective view of the display of the down-line terminal illustrated in FIG. 4 a providing specific detail as to the procedure and controls used to conduct the boolean front-end search.

FIG. 5 is a detailed perspective view of the display of the down-line terminal according to the present invention which further illustrates a natural language context-sensitive search of case law, the search having been based on a communication from another terminal.

FIG. 6 is a perspective view of the display of the down-line terminal and searching context illustrated in FIG. 5 wherein a case law database is directly accessed.

FIG. 7 is a flow diagram illustrating operation of the computer software utilized by the down-line terminal of the present invention in providing for an interactive natural language search.

FIG. 8 is a flow diagram illustrating the operation of the computer software utilized by the down-line terminal in providing for an interactive boolean search.

FIG. 9 is a diagram representing the association of data fields into three data records which are the basic building blocks for the overall transcription structure according to the present invention.

FIG. 10 is a detailed diagram representing the overall data structure of the cross-reference library used by the CAT system of the present invention to transcribe the key-stroke codes received from the stenographic recorder.

FIG. 11 is a diagram illustrating a set of data records involved in the database indexing structure according to the present invention which is used by the natural language and searching front-end to provide for search word selection, verb conjugation, thesaurus text and usage information for optimizing a search.

FIG. 12 is a detailed diagram representing the database indexing structure of the present invention which provides a backward index into the actual text stored in the database to be searched.

FIG. 13 is a diagram illustrating the techniques used to compute the significance number of a given search word for aiding in the natural language and boolean search formulations.

DESCRIPTION OF THE PREFERRED EMBODIMENT

FIG. 1 is a block diagram which illustrates an overall system configuration of the present invention as used in a legal proceeding such as a deposition or trial. A stenographic recorder 11 is used by a court reporter at a deposition, hearing or other legal proceeding to record digital coded signals representative of the words spoken as they occur in real-time. Using a communication link 19, the recorder 11 transfers the representative signals to a computer aided transcription (“CAT”) system 13, a computer terminal used by the court reporter, for transcription into alphabetic and numeric text corresponding to the actual words spoken. As a basis for transcription, the CAT system 13 uses a cross-reference library which is stored in a transcription database 33.

The CAT system 13 communicates the transcribed alphabetic and numeric text it generates along two independent communication links 20 and 21. First and second chair examining attorneys view the transcription on examining attorney terminals 15 and 17, respectively. Similarly, first and second chair defending attorneys view the transcription on defending attorney terminals 16 and 18, respectively. Upon receipt of the communicated transcription, the attorney terminals 15, 16, 17 and 18 not only display the alphabetic and numeric text, but also provide a variety of tools for reviewing and evaluating what has been received. Concurrent with receipt and display of the transcribed text, the examining terminals 15 and 17 provide a vehicle for the first and second chair attorneys to exchange messages. Similarly, the defending terminals 16 and 18 provide for message exchanges between the first and second chair defending attorneys.

The attorneys at the terminals 15-18 interact with the transcribed text received in a variety of ways such as to create associations with notes created or messages received during the proceeding. A more complete description of the interaction is set forth in the pending parent U.S. application Ser. No. 08/036,488, filed Mar. 24, 1993, by Bennett et al. (Attorney Docket No. P93-00), which is incorporated herein by reference in its entirety.

The attorneys also interact through the terminals 15-18 with local or remote case law 23 or 25, respectively. The local case law 23 includes a CD-ROM based database which may be accessed directly via links 20 or 21, or indirectly through requests to the CAT system 13 which manages the searching via a communication link 31. The latter scenario is preferable for billing purposes. Similarly, access to law suit databases storing case evidence and possibly attorney work product is provided locally and/or remotely via law suit databases 27 and 29, respectively. Although not shown, it is also contemplated that the enumerated databases could be further distributed or duplicated in the configuration illustrated.

From all sources of text received and displayed by the terminals 15-18, the attorneys can freely select portions thereof for searching the various case law and evidence databases with little or no typing or other interaction required. Various search modification aids and contextual analysis provide additional search preparation short-cuts which provide the attorneys with rapid access to needed information.

Although preferred, neither a keyboard nor a screen is necessary for the CAT system 13. In fact, the terminal itself, i.e., the functionality thereof, might exist within other nodes on the transcription network. For example, the functionality of CAT system 13 might be fully distributed within the recorder 11 and/or the examining attorney terminal 15. The functionality might also be fully or partially located at some remote, off-site location. Similarly, the present invention contemplates many situations where terminals are not available for each attorney present, such as where only the examining side uses terminals or vice versa, or where either side utilizes a single terminal. Situations may arise where no attorneys possess a terminal. Although not shown in FIG. 1, additional terminals might be used locally by the witness or off-site by magistrates, judges, clients, expert witnesses, or additional attorneys involved in the case.

FIG. 2 is a detailed perspective view of the display of the attorney terminal 15 according to the present invention which illustrates the use of context sensitive searching based on an exemplary mischaracterization objection. As shown, each attorney terminal such as the examining attorney terminal 15 includes a screen 51 and a keyboard 53. The screen 51 is split into a transcription window 55 and a communication window 57, having a common command line 59. The transcription window 55 displays the transcribed text received from the CAT system 13, by sequentially displaying questions (Q's) and answers (A's, not shown) and miscellaneous associated conversation (from the defending attorney Mr. Smith) in virtually real-time.

The communication window 57 provides a visual display in either a stack mode or an edit mode. In the stack mode, the communication window 57 displays the first line of every communication received. The attorney terminal 15 receives communications from three sources: 1) the keyboard 53 (personal notes “From: Self” or messages “To:” others); 2) the examining attorney terminal 15 itself, through artificial intelligence (“AI”) algorithms (“From: AI” conveying potential objections or deposition scheduling messages); and 3) other terminals on the network, such as from the second chair examining attorney via the attorney terminal 17 (for example, “From: 2nd”). Upon selection of a desired communication from the stack of messages illustrated in the window 57, the window 57 enters the edit mode, displaying the full text of the selected communication. While in the edit mode, communications may also be created, modified, deleted, copied, printed, or communicated to the other terminals such as the terminal 17.

Throughout the deposition or trial, the terminal 15 displays various contextual text, i.e., the transcript text and associated communications from all sources. In most circumstances, the displayed contextual text not only creates the need to search case evidence or law, but also provides a significant formulation of terms needed to conduct that search. Instead of requiring a complete reformulation and retyping of a search, the searching front-ends of the present invention provide for immediate searching based on the available contextual text being displayed. Additionally, when necessary, the searching front-ends provide for rapid modification of the contextual text with minimal attorney interaction.

For example, objections to the form of a question must be seasonable; therefore, timing is of utmost importance. To aid the objection process, artificial intelligence software routines analyze the form of each question and the content of each answer to provide various potential objections that the examining or defending attorney may want to take into consideration in attempting to achieve proper evidentiary form. For example, a search is made on each question for phrases such as “you said”, “you stated”, “you say”, etc. If found, the AI routines immediately send a potential objection to the communication window 57 indicating that the question possibly mischaracterizes the witness's earlier testimony.
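By way of illustration only, the phrase search described above might be sketched as follows. The function names, the fixed phrase list, and the wording of the generated message are assumptions made for this sketch and are not taken from the disclosure, which does not specify how the AI routines are implemented.

```python
import re
from typing import List, Optional

# Cue phrases named in the text; a real list would presumably be longer ("etc.").
CUE_PHRASES = ("you said", "you stated", "you say")


def find_cue_phrases(question: str) -> List[str]:
    """Return the cue phrases present in the question, ignoring case."""
    lowered = question.lower()
    found = []
    for phrase in CUE_PHRASES:
        # Word boundaries keep "you say" from matching inside "you saying".
        if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
            found.append(phrase)
    return found


def potential_objection(label: str, question: str) -> Optional[str]:
    """Build a hypothetical 'From: AI' message, or None if no cue phrase is found."""
    cues = find_cue_phrases(question)
    if not cues:
        return None
    return "From: AI - {} may mischaracterize earlier testimony ({})".format(
        label, ", ".join(cues)
    )
```

A question containing none of the listed phrases produces no message, so the communication window would receive nothing for it under this sketch.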

Specifically, as illustrated, in the fifty seventh question (Q57), the examining attorney attempts to achieve proper form by summarizing the witness's previous answers which may have spanned several hours of the deposition or trial. Each attorney knows that with a summarizing question, a short, affirmative answer thereto can be easily extracted from the witness. The attorney can then more effectively utilize the resulting question and answer in a brief, motion or other legal argument involving the factual issues being summarized. However, in the attempt to summarize the prior testimony, the examining attorney is likely to mischaracterize what was stated either intentionally to achieve some legal advantage or unintentionally due to poor recall of the previous questions and answers which are used for the summary.

The AI routines immediately detect the attempted summary such as is represented by Q57, and send a potential mischaracterization objection 61 to the attorney terminal 15. In an attempt to avoid such a compact and potentially incorrect formulation of the facts, the defending attorney, a “Mr. Smith” in the illustrated example, objects. The objection is registered by the attorney terminal 15 in the transcription window 55. The examining attorney may immediately respond by selecting the objection message 61 (illustrated by highlighting), and then selecting a search control 63 to initiate a search.

The terminal 15 responds to the selection of the search control 63 by first analyzing the context of the current display. By engaging in the analysis, the terminal 15 finds that the search is most probably based on the selected objection 61, not the terms therein. The terminal 15 also identifies the text of the associated Q57 as the basis for a search, and, most likely, the specific database and database units to be searched are the current transcript and Q & A's therein, respectively. The terminal 15 makes this determination based on the probability that, given the context, the attorney will want to find all of the previous Q & A's which were used in the attempted summary of Q57 involving the current transcript so that a non-objectionable reformulation of the summary can be made. As each previous Q & A is located via a search, the examining attorney may sequentially read them into the record without mischaracterization, and thereby achieve proper form.
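A minimal sketch of this contextual determination is given below, assuming a simple rule lookup. Only the mischaracterization rule comes from the description above; the `SearchPlan` record, the `plan_search` function, the string labels, and the fallback branch are hypothetical names and behavior introduced solely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SearchPlan:
    basis_text: str  # text from which the search is formulated
    database: str    # database preselected for searching
    unit: str        # database unit preselected for searching


def plan_search(selected_kind: str, selected_text: str,
                associated_question: Optional[str]) -> SearchPlan:
    """Pick a search basis and default database from the display context."""
    if selected_kind == "mischaracterization" and associated_question:
        # Rule from the text: search on the associated question, not on the
        # words of the objection message, through the current transcript's Q & A's.
        return SearchPlan(associated_question, "current transcript", "Q & A")
    # Assumed fallback (not from the text): use the selected text itself and
    # leave the database choice to the attorney.
    return SearchPlan(selected_text, "unspecified", "unspecified")
```

Because the result is only a preselection, the attorney would remain free to override the database or database units as described for the pull-down menu.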

FIG. 3 a provides an illustration of the specific response of the terminal 15 to the selection of the search control 63. When the examining attorney chooses the search control 63 in response to the mischaracterization objection, a pull-down search menu 65 appears offering two choices of searching front-ends and various selections of databases and database units for searching. Initially, the attorney may override the database and/or database units to be searched by “checking” those more appropriate. Similarly, additional databases and/or database units may be selected. However, as previously stated, because of the context, the terminal 15 has automatically selected the current transcript and Q & A's therein for searching. In addition, changes in the database or database units may be made at any time during a search, and the terminal 15 will continue the search through the new selections without requiring search modifications.

Either a natural language or boolean searching front-end may be selected from the pull-down menu 65. Once either is selected, the terminal 15 automatically attempts to formulate, based on the context, a search which may be initiated with little or no modification or interaction required. In particular, the natural language front-end of the present invention provides fully functional searching capability which, in most circumstances, requires no typing by the attorney. In the illustrated example, because the attorney selected the potential mischaracterization objection 61 before selecting the natural language search, the attorney terminal 15 automatically formulates a natural language type search from the associated question, Q57.

First, the terminal 15 opens a natural language front-end window 67 and displays Q57 therein. Thereafter, using statistical and grammatical analysis techniques, the attorney terminal 15 identifies and highlights those words of Q57 believed to have the most significance. Two colors of highlighting are provided, giving three significance classifications: 1) no highlighting for insignificant words; 2) blue highlighting for words of normal significance; and 3) red highlighting for highly significant words. The highlighting colors or lack thereof provide the attorney with instant feedback as to what the terminal 15 plans to use for the search.

After verifying the significance classification of terminal 15, the attorney may immediately initiate the search by selecting a “search” button from a button control panel 69. If the attorney does not agree with the significance classification or would for any other reason desire to modify the search, the search can be modified by a variety of tools available through the search window 67. In particular, the attorney may change the significance of a search word to reflect the context of the current case which the terminal 15 did not, or could not, detect. To do so, highlighting priority may be toggled from no highlighting, to blue, to red and back to a non-highlighted condition by repeatedly selecting a given word. The attorney may also “double-click” (quickly select twice) on a search word, whereupon the significance number and the number of times the word exists in the search database are displayed.
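The toggling sequence just described can be sketched as a three-state cycle. The `Significance` enumeration and the `toggle` function are illustrative names assumed for this sketch; the text specifies only the order of the cycle.

```python
from enum import Enum


class Significance(Enum):
    NONE = "no highlighting"  # insignificant word
    BLUE = "blue"             # normal significance
    RED = "red"               # highly significant


# Order in which repeated selection of a word steps through the classes.
_CYCLE = [Significance.NONE, Significance.BLUE, Significance.RED]


def toggle(current: Significance) -> Significance:
    """Return the classification that follows `current` in the cycle."""
    return _CYCLE[(_CYCLE.index(current) + 1) % len(_CYCLE)]
```

Selecting a word three times in this sketch returns it to its starting classification.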

Additionally, if necessary the attorney may directly add additional terms or phrases via the keyboard 53 or by copying additionally displayed text into the window 67. The terminal 15 considers all new words added as significant (blue highlighting) unless a reassess button is selected from the button control panel 69. If selected, the terminal 15 applies the same statistical and grammatical analysis techniques on the newly added words. Thereafter, the attorney may again interactively change the resulting significance classifications, add additional words and reassess as necessary until the appropriate search is achieved. Once the attorney is satisfied with the formulated search, the attorney initiates the search by selecting the search button of the button control panel 69.

The term “search” as used herein is defined as follows. In the preferred embodiment (as will become apparent in reference to the FIGS. 11 and 12 below), instead of requiring an actual word search through a textual database, an index (a database indexing structure) is prepared for the database which associates each word in the database with the locations of that word in the database. A search word can then be used as an index into that database to directly access the information needed by the searching front-ends. A standard textual database search may also be used to carry out a majority of the functionality of the present invention; however, it is not the preferred mode of operation. Thus, the term “search” as used herein may refer to the indexing of the words in the database indexing structure associated with the database. In addition, if a standard textual searching of the database is chosen, searching refers to the actual parsing through the database to locate the text.
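The indexing approach defined above corresponds to what is commonly called an inverted index. The following is a minimal sketch under stated assumptions: database units are held in a dictionary keyed by a unit identifier, words are split with a simple regular expression, and locations are recorded as (unit, word position) pairs. None of these details, nor the names `build_index`, `locations` and `hit_count`, reflect the record layouts of FIGS. 11 and 12.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple

Location = Tuple[str, int]  # (database unit id, word position within the unit)


def build_index(units: Dict[str, str]) -> Dict[str, List[Location]]:
    """Associate each word in the database with every location where it occurs."""
    index: Dict[str, List[Location]] = defaultdict(list)
    for unit_id, text in units.items():
        for position, word in enumerate(re.findall(r"[a-z0-9']+", text.lower())):
            index[word].append((unit_id, position))
    return index


def locations(index: Dict[str, List[Location]], word: str) -> List[Location]:
    """Look a search word up directly, without parsing any text."""
    return index.get(word.lower(), [])


def hit_count(index: Dict[str, List[Location]], word: str) -> int:
    """Number of distinct database units in which the word may be found."""
    return len({unit_id for unit_id, _ in locations(index, word)})
```

Once the index is built, a lookup touches only the index, so feedback such as hit counts could be produced before any text file is read.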

Because of the mischaracterization context, the attorney terminal 15 responds by immediately searching backward through every Q & A in the current transcript. The natural language front-end responds by identifying the most significant Q & A's in the transcript, i.e., the Q & A's offering the best possible matches. The identified Q & A's are ordered for sequential display by the terminal 15 upon repeated depression of the search button of the control panel 69. The attorney can discontinue this process at any time by pressing a cancel button of the control panel 69. To back-track through the search, the attorney may select a previous button in the control panel 69, and the terminal 15 returns to the last displayed Q & A identified.

The button control panel 69 also provides conjugate and thesaurus buttons to aid the natural language search. The functionality associated with the conjugate button may be applied on a verb by verb basis or on all verbs in the search window 67. Application of the conjugate button directs the natural language front-end to locate all conjugations of the verb selected as word alternates when attempting to locate that verb in databases. Similarly, the thesaurus button may be applied specifically or to every word of the search. Choosing the thesaurus button directs the terminal 15 to consider all thesaurus type alternates for the word selected when attempting to locate the selected word in the database. Additionally, both the conjugate and thesaurus functionality may be applied selectively. By “double-clicking” a mouse button or by selecting either button twice in succession, the terminal 15 locates and displays a list of available conjugates or thesaurus alternate words. The displayed alternate words may then be specifically selected individually or in groups so as to ignore undesired alternates. Enabling, disabling or tailoring of the conjugate or thesaurus functionality can occur at any time before or during a search.

Only those conjugate and thesaurus alternates which exist in the selected database are offered for use in the actual search. In particular, the terminal 15 locates all possible word alternates for the selected search word whether or not they can be found in the database to be searched. However, those alternates which cannot be found are indicated to the attorney by displaying them in an italic font. Although the resulting search will not be able to locate the alternate, the alternate may be used to access further word alternates via a second selection of the thesaurus or conjugate buttons for that alternate. Similarly, any alternate words located may receive the same significance adjustment or alternate word association as any of the originally selected search words receive. The terminal 15 also provides significance classifications for the alternate words automatically.
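The handling of alternates described above might be sketched as follows. The alternates are assumed to come from some thesaurus or conjugation source that is not modeled here, the database vocabulary is assumed to be available as a plain set of words, and asterisks stand in for the italic font; the function names are hypothetical.

```python
from typing import Iterable, List, Set, Tuple


def mark_alternates(alternates: Iterable[str],
                    database_words: Set[str]) -> List[Tuple[str, bool]]:
    """Pair each alternate with a flag telling whether the database contains it."""
    vocabulary = {word.lower() for word in database_words}
    return [(alt, alt.lower() in vocabulary) for alt in alternates]


def render(marked: List[Tuple[str, bool]]) -> List[str]:
    """Show found alternates plainly; wrap missing ones in asterisks (italics)."""
    return [alt if found else "*" + alt + "*" for alt, found in marked]


def searchable(marked: List[Tuple[str, bool]]) -> List[str]:
    """Only alternates that exist in the database take part in the search."""
    return [alt for alt, found in marked if found]
```

An alternate flagged as missing stays visible in this sketch, so it could still be selected as the starting point for a further lookup of alternates.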

As previously indicated, any word that is involved in the search which cannot be found in the selected database is displayed in italic font in addition to any highlighting which might be required. This feature immediately indicates to the attorney that the word is probably misspelled (red italics) or that the word is spelled correctly but does not exist in the database (blue italics). In the latter case, an alternate word could be selected using the conjugate or thesaurus buttons from the control panel 69. In the former case (red highlighting), the terminal 15 provides a spell-check, dictionary functionality upon a conjugate or thesaurus button selection. This functionality provides the attorney with potential word substitutes in much the way standard spell-checking software performs the task for word processing. The major difference, however, is that italics and highlighting are added to aid the attorney in selecting not only the correct word but also a word which may be found in the database.

Additionally, from any word in the search window, the attorney may select the right mouse button, which causes the natural language front-end to immediately search using the selected search word or selected portion of the search words displayed in the search window 67. This may occur after the attorney double clicks the left button to view the number of hits (i.e., the number of database units in which the selection may be found) and realizes that the subset search may provide the specific database unit at the root of the search. The double-left clicking functionality may be used on a single word or multiple words from the search words displayed. If all of the search words are selected, the attorney will be provided with the total number of hits located.

Because the number of hits may be very high, the attorney may modify the search to remove the more common words. However, such removal may not be necessary if there is an exponential-like database unit probability distribution. In particular, with reference to FIG. 3 b, the natural language front-end of the present invention offers a sequential display of those database units in which hits occur on a highest to lowest probability basis. For example, a database unit containing all of the search terms is much more likely to offer the desired search results than another database unit having only a single search word hit. Similarly, some database units offer a higher probability of searching success than others. The actual implementation of the prioritization of database units is described in more detail below in relation to FIG. 13. The relative probabilities of the database units registering hits are plotted upon attorney command in priority order in a graph window 78 with relative probability on the y-axis and hit number on the x-axis. As detailed below, hits can be determined without having to conduct a search. From this graph, the attorney can determine which, if any, of the hits to view by selecting a cut-off point using the mouse or cursor (as illustrated by the dotted line). Therefore, if an exponential type pattern is displayed (as illustrated), the attorney gleans that even if a large number of hits have been recorded, only a reasonable number of database units need be reviewed to find desired units. However, if instead the attorney notices that the resulting curve is very flat, locating a desired database unit will probably require review of a substantial portion of all of the database units hit. Based on the graphical review, the attorney may choose to modify the search without actually having to read anything if there is only a low probability of success.
In addition, after a portion of the hits has been selected for review, the terminal 15 computes the area under the selected portion of the curve to provide an overall probability number which indicates the chances that, if a desired database unit exists in the total number of hits recorded, the selected units will contain the desired unit. This percentage number along with the total and selected hits is displayed in a box 76.

If the attorney decides not to review the graph, the natural language front-end of the present invention automatically chooses the ten (10) best hits and reports the chances that those ten (10) contain the desired database unit (if it does exist at all) in the hits recorded. After reviewing the chance percentage, the attorney may then visit the graphing function to better adjust the search.

Adjusting a search using the graph window 78 is done by merely altering the search terms in the search window and selecting the reassess control from the control panel 69 while the graph window is being displayed. Doing so causes the terminal 15 to re-compute and re-plot the graph. In addition, by merely selecting a given term and then selecting the reassess control, the terminal 15 responds by updating the graph by using a second color to indicate the selected word's contribution to the graph. Specifically, a single vertical bar representing a single database unit which contains the selected word would be divided vertically, and color coding of the divided portions would indicate the amount of contribution by the selected word. In this way, the attorney can visualize the effect of removing words or otherwise comprehend the role of a given word in a search.
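By way of illustration only, the division of each vertical bar might be computed as in the following sketch. It assumes that the height of a bar is the sum of the significance numbers of the search words found in the database unit, as described in relation to FIG. 7. The function name `bar_portions` and the shape of its arguments are hypothetical and do not appear in the described embodiment.

```python
def bar_portions(hits, significance, selected_word):
    """Split each database unit's bar into the selected word's share and the rest.

    hits: dict mapping a database unit number to the set of search words found in it.
    significance: dict mapping each search word to its significance number.
    Returns a dict mapping unit number -> (selected_portion, remaining_portion).
    """
    portions = {}
    for unit, words in hits.items():
        # Assumed bar height: sum of the significance numbers of the words hit.
        total = sum(significance[w] for w in words)
        selected = significance[selected_word] if selected_word in words else 0
        portions[unit] = (selected, total - selected)
    return portions


hits = {1: {"grandma", "miles"}, 2: {"miles"}}
significance = {"grandma": 3, "miles": 1}
print(bar_portions(hits, significance, "grandma"))
```

A bar whose selected portion is zero would be unaffected by removing the selected word, which is the kind of effect the second color is meant to make visible.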

Additionally, the terminal 15 can automatically and directly annotate any text located in whole or in part to the source of the text used as the basis for the search for later recall and analysis. For example, upon selecting an annotate command from the command line 59, the terminal 15 will annotate the displayed Q & A identified to Q57. In this way, after the proceeding, the search results can be reviewed without having to reformulate a search.

Referring to FIG. 4 a, in a variation of the previous example, instead of selecting the mischaracterization objection, the attorney may choose to directly search a portion of Q57 by highlighting the desired terms and/or portions thereof and pressing the search control 63. Again, as previously illustrated, the context suggests to the terminal 15 that the Q & A database units of the current transcript should be searched. Additional databases might also be simultaneously chosen or deselected by “checking” them at will. The database category labelled “specify” allows the attorney to define specific combinations of databases in part or in their entirety for rapid selection. The contextual conditions for selecting any of these libraries can also be configured using a setup routine prior to beginning the proceeding. Other automatic database selections are also contemplated which also provide for other contextual search requirements that might be at issue.

After selecting desired search text 85, instead of choosing the natural front-end, the boolean front-end is selected from the pull-down search menu 65. In response, the terminal 15 opens a boolean search window 81 from which a boolean type search may be formulated. Specifically, upon selecting the boolean front-end, the terminal 15 extracts the selected text 85, opens the search window 81, and places the selected text in a work-space therein. The attorney may copy and paste additional sections of text from that displayed on the screen 51 if so desired. This can be accomplished using various editing functionality provided via an edit button of a button control panel 83.

A strip button in the control panel 83 borrows the statistical and grammatical computations of the natural front-end to strip out all unnecessary text captured using a quick selection of desired search text without having to worry about interleaving excess words. Particularly, the terminal 15 responds to the strip control by selectively highlighting only those words identified as significant. By selecting the strip button a second time, the remaining unhighlighted words (considered insignificant) are removed from the work-space 82. In the example illustrated, by “double-clicking” on the strip button, the window 81 only displays “miles grandm”, by automatically stripping out the insignificant words “to and your”. Moreover, as with the natural front-end, additional text may be selected, automatically added and stripped, or manually entered whenever necessary.

In general, the boolean searching front-end syntax is very similar to that provided in other legal searching databases familiar to most attorneys. The boolean syntax is constructed using logical operators (“&” or “OR”), wild card characters (“*” or “?”), parentheses and brackets to modify the specific text to be located. All of these operators may be easily selected and inserted from a pull-down list upon selection of an operators button in the button control panel 83. Delimiter operators are also provided. For example, a “/2” delimiter expands the search to within two consecutive database units, while a “/3” corresponds to within three such consecutive database units, and so on. Various other types of well known delimiters are also contemplated.

The database unit for the current transcript with only Q's selected via the pull-down search menu 65 is a single Q. Similarly, if Q's and A's are selected, the database unit for the delimiter would be a single Q & A. For case law, the standard database unit is a single legal decision. For most case evidence, the database unit would be a single document. Other databases similarly provide logical database units for searching.

FIG. 4 b illustrates a resulting boolean search formulation constructed by “double clicking” twice on the strip button and selecting the “&” operator using the pull-down selection of the operators button. The terminal 15 displays a resulting boolean “hit” at a Q&A#23 in the transcription window 55. This search was initiated and can be continued by selecting either search up or search down buttons from the control panel 83. As with any search, the identified Q&A#23 can be annotated (or associated) to the database unit from which the search text was extracted, in this example, Q57.

Referring to FIG. 5, the attorney terminal 15 also uses text from communications displayed in the communication window 57 for search formulations. Specifically, in the communication window 57, a message from the second chair attorney is received and fully displayed in the edit mode. The message displayed, which is associated with a Q&A#60, illustrates a typical response which the first chair might receive from an associate attorney after evaluating the Q & A's 59 and 60 displayed in the transcription window 55. Believing that the witness will only testify in common areas of contract law, an attorney finds that a new area of law, the law of duress, has unexpectedly become an issue in the case. If the attorney is unaware of the specifics of the law of duress, he may not be able to seasonably extract the appropriate factual information from the witness. Timing is critical here because if the witness's counsel (the defending first chair attorney) has an opportunity to confer with the witness before the appropriate questions are asked, the examining attorney may be thwarted in obtaining critical evidence. Conferrals usually result in the “softening” of the testimony. Using the terminal 15, the attorney need only select the appropriate communication or portion thereof from the communication window 57. This can be done from either the stack mode or the illustrated edit mode. After selection, the search control 63 is selected, and, as previously described, either a natural or boolean type search can be conducted.

Referring to FIG. 6, the selection of the search control 63 causes the display of the pull-down search menu 65. The terminal 15 analyzes the request by first looking to specific contextual clues such as previously detailed in regard to the mischaracterization objection. Because no initial context can be detected, the terminal 15 next evaluates the words selected to see if any of them are legal terms, and identifies the legal term “duress”. In identifying the term, the terminal 15 concludes that, because of the legal term duress, the attorney most probably desires to search state case law. By then referencing the setup files, the attorney terminal 15 extracts the choice of law state and selects the corresponding libraries for searching. The terminal 15 thereafter automatically “checks” the “law” database from the menu 65, and although not shown, “checks” the appropriate state law library in a second pull-down window, similar to the second pull-down menu 77 of FIG. 3 a, which lists the various libraries of the law database which are available for selection. The auto-selection of either the new database or library selection can be modified by direct selection by “clicking” the pull-down menus.

Although not specifically shown in FIG. 6, by selecting the natural language front-end, a natural searching window such as that displayed in FIGS. 3 a and 3 b is provided for potential modification as described above. Similarly, a boolean front-end might also be selected and interactively invoked, as described above. In addition, if during a search using one front-end the attorney realizes that the other might be more appropriate, the attorney can merely select the other front-end from the search pull-down menu 65, and the terminal 15 moves the selected search words to the newly selected search window.

In response to the search initiation, a case law database window 85 is opened to display case law retrieved. All preliminary library selection, log-in interaction, etc., happens automatically without attorney intervention. Additionally, the default number of cases desired and time-periods for case law search inquiries may be preset as contextual information provided during setup.

As illustrated in FIG. 1, locally stored case law may be provided on compact disk and managed by the CAT system 13 as described above. Remotely located case law databases accessed may be those provided by West Services, Inc. (Westlaw®) or by Mead Data Central (Lexis®) wherein the interfacing and case law retrieval occurs in the background oblivious to the attorney. In addition, to take full advantage of a natural language searching front-end that might be associated with remote case law databases, instead of sending only the highlighted text identified, the attorney terminal 15 might also forward the full text selection to aid the remote natural language search, for example, by enabling grammatical context analysis.

Referring to FIG. 7, a flow-chart provides a detailed illustration of the operation of the software routines used by an attorney terminal, such as the terminal 15, for managing a natural language front-end search. Whenever natural language search is selected, the terminal 15 initiates the illustrated software routine at a block 101 labelled “begin”. At a block 103, the terminal 15 obtains access to a database indexing structure, described in more detail below, which contains a nearly complete set of all of the possible words which might occur in the proceeding, and which also contains pointers from every word used in the proceeding to every Q or A that contains that word. Access to the database indexing structure may be achieved in a time sharing fashion with the CAT system 13 and other terminals on the network. In one such configuration the CAT system 13 acts as a file server and database indexing manager via the transcription database 33. Maintenance involves adding the location of each word transcribed to the structure. Alternatively, each attorney terminal may maintain (via transcribed word additions) the entire database indexing structure locally. If access is shared, at the end of the proceeding, the attorney can copy the structure locally to take with them for review without having to consider maintenance. If maintained locally, access is virtually immediate.

At a block 105, the terminal 15 generates a list of search words comprising each unique word and/or partial word in the text which has been selected for the natural language search. As previously described and as described in more detail below, the terminal 15 thereafter uses each of the search words to extract a significance number associated therewith, at block 107. At a block 109, the attorney terminal 15 compares the significance number of each search word with two significance threshold values. If the comparison indicates that a search word is below both thresholds, it is declared insignificant. The remaining words are at least considered significant and, therefore, will at least receive blue highlighting. If, however, any of the remaining search words have significance numbers above the higher of the two thresholds, the word receives red highlighting indicating enhanced significance.

Additionally, the terminal 15 utilizes the database indexing structure to determine whether the search words have even been used in the database to be searched, at a block 110. Each word that does not exist is additionally identified with italics. This procedure might also occur before the significance classification and may be used only on significant words if desired.

Upon classifying the significance and determining the existence of the search words in the database, the terminal 15 displays the search words in a format indicating word significance and existence so that the attorney may immediately comprehend the nature of the search and possibly modify the search and/or initiate the search, at a block 111. Specifically, the words believed to have the highest significance receive red highlighting, those believed to have the lower threshold significance receive blue highlighting, and those believed to be insignificant receive no highlighting. Italics are displayed for any search word which does not exist in the database to be searched.
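The classification of the blocks 107 through 111 might be sketched as follows. This is a minimal illustration under stated assumptions rather than the actual routine: the threshold values `low` and `high`, the `DisplayedWord` type, and the use of a dictionary for significance numbers are all hypothetical. A word absent from the significance table is treated as absent from the indexing structure and shown in red italics, following the description of possible misspellings.

```python
from dataclasses import dataclass


@dataclass
class DisplayedWord:
    word: str
    highlight: str  # "red", "blue", or "none"
    italic: bool    # True when the word is not used in the database searched


def classify(words, significance, database_words, low=0.3, high=0.7):
    """Assign highlighting and italics to each search word.

    significance: dict of word -> significance number (assumed lookup table).
    database_words: set of words actually used in the database to be searched.
    low, high: the two significance thresholds (illustrative values only).
    """
    shown = []
    for word in words:
        if word not in significance:
            # Not in the indexing structure at all: possible misspelling.
            shown.append(DisplayedWord(word, "red", True))
            continue
        number = significance[word]
        if number > high:
            highlight = "red"    # above the higher threshold
        elif number < low:
            highlight = "none"   # below both thresholds: insignificant
        else:
            highlight = "blue"   # significant, but not enhanced
        shown.append(DisplayedWord(word, highlight, word not in database_words))
    return shown


significance = {"grandma": 0.9, "miles": 0.5, "to": 0.1}
for shown in classify(["grandma", "miles", "to", "grandm"], significance, {"grandma", "to"}):
    print(shown)
```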

An attorney noticing a red highlighted term with italics immediately realizes that the given search word was not found in the database indexing structure at all. This in turn indicates a possible misspelling. If the attorney attempts to choose alternate word associations using the thesaurus or conjugate buttons as previously described, a spell-checker provides a correct spelling via suggested spelling alternate words. These suggested alternate words also utilize the italics to indicate whether or not the suggested word was used in the database. Specifically, only words which exist in the database indexing structure are offered as spelling substitutions. However, even words in the structure which have not been used in the actual database itself are still offered, with italics indicating the situation. The attorney might choose such a substitution merely to get to the thesaurus or conjugate functionality. Moreover, blue italics indicate that the word exists in the database indexing structure yet has not been used in the corresponding database. Conjugate and thesaurus functionality can be accessed to find word alternates which do exist in the database.

Editing capabilities are also provided by which selected search words might be removed from or added to the unique word list, or by which the significance categorization might be tweaked if desired, before initiating the search via a block 113. The variety of editing functionality, including the graphical interfacing, is also provided at the block 111.

More specifically, the thesaurus and conjugate buttons of the control panel illustrated in FIGS. 3 a and 3 b above provide alternate words to expand a search where necessary. Only the word alternate with the highest significance number is considered in determining the “hit” probability of a given database unit. Although a more complex averaging scheme might be used, the additional overhead does not seem justified. This is only an issue when multiple alternate words occur in a single database unit.
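The rule that only the highest-significance alternate counts might be sketched as below. The sketch assumes that an originally selected search word and its alternates form one group and that the group contributes the largest significance number among those of its members actually present in the database unit. The name `unit_score` and the argument layout are hypothetical.

```python
def unit_score(words_in_unit, search_terms, significance):
    """Sum, per search term, the best significance among the variants present.

    words_in_unit: set of words occurring in one database unit.
    search_terms: dict of search word -> list of its alternates (may be empty).
    significance: dict of word -> significance number.
    """
    total = 0
    for term, alternates in search_terms.items():
        present = [w for w in [term, *alternates] if w in words_in_unit]
        if present:
            # Several alternates in the same unit count once, at the maximum.
            total += max(significance[w] for w in present)
    return total


terms = {"house": ["home", "residence"], "grandma": []}
numbers = {"house": 2, "home": 3, "residence": 1, "grandma": 5}
print(unit_score({"house", "home", "grandma"}, terms, numbers))
```

Taking the maximum rather than an average keeps the per-unit computation to a single pass, which is in keeping with the stated concern about overhead.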

At a block 115, after initiating the search, the terminal 15 compiles all database units that contain significant search words. As previously detailed, the transcript database units are Q & A's if Q's and A's were selected for the database units to be searched. Other database units depend on the database selected for the search. Again, in the preferred embodiment, “searching” actually involves direct retrieval of the locations of words via the database indexing structure and not actual text string searching. In the block 115, all of the locations are compiled using the direct retrieval.
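The direct retrieval of the block 115 resembles a lookup in what is commonly called an inverted index. The following sketch assumes a simple in-memory dictionary from each word to the set of database units containing it; the function names, the whitespace tokenization, and the sample transcript text are illustrative assumptions and not the disclosed database indexing structure.

```python
from collections import defaultdict


def build_index(units):
    """Map each word to the set of database unit numbers that contain it."""
    index = defaultdict(set)
    for unit, text in units.items():
        for token in text.lower().split():
            word = token.strip('.,?!"')
            if word:
                index[word].add(unit)
    return index


def compile_hits(index, significant_words):
    """Collect, per database unit, the significant search words found there.

    No text is scanned here; only the word-to-unit pointers are followed.
    """
    hits = {}
    for word in significant_words:
        for unit in index.get(word, ()):
            hits.setdefault(unit, set()).add(word)
    return hits


units = {
    1: "How many miles to your grandma?",
    2: "About ten miles.",
    3: "I do not recall.",
}
index = build_index(units)
print(compile_hits(index, ["miles", "grandma"]))
```

Because maintenance consists only of adding each newly transcribed word's location, the same structure can be updated as the proceeding continues without rebuilding it.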

At a block 117, the terminal 15 computes the potential likelihood that a given database unit identified at the block 115 provides the information desired by the attorney. Specifically, the terminal 15 sums up the significance numbers of the search words located in each database unit recording “hits”. This sum is herein referred to as the relative probability. Again, more complex schemes are contemplated but are not believed to offer any significant overall advantage.

Thereafter, at a block 119, the terminal 15 orders the database units with the highest relative probabilities first and lowest last. Ties encountered in the summation of the significance numbers are decided by ordering the most recently occurring database unit of the ties first. Thereafter, at a block 121, the terminal 15 provides an interactive, sequential display of the identified database units in their order of potential likelihood of success. Included herein is the calculation of the probability percentage illustrated in FIG. 3 b. The basis of this calculation involves a presumption that the formulated search contains one and only one database unit which will meet the attorney's goals if located. Certainly, this is often not a factual assumption; however, it provides a sufficient basis for providing immediate feedback to the attorney regarding the nature of the search word(s) in the database to be searched. To calculate the probability percentage, the terminal 15 sums up all of the relative probabilities of each of the identified database units. Next, either using the default “review number” (the top ten database units) or the selected review number via the graph, the terminal sums the total relative probability of these best-match units, and, by dividing the latter sum by the former and multiplying by 100, the terminal 15 produces the probability percentage. The block 121 also provides the variety of interactive search adjustments previously detailed.
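The ordering of the block 119 and the probability percentage of the block 121 might be sketched as follows. The sketch assumes that database units are numbered in order of occurrence, so that a larger unit number means a more recent unit for tie-breaking purposes. The function names and argument layout are hypothetical.

```python
def rank_units(hits, significance):
    """Order database units by relative probability, highest first.

    hits: dict of unit number -> set of search words found in that unit.
    Relative probability is the sum of the significance numbers of those words.
    Ties go to the most recent unit (assumed here to be the larger number).
    """
    scored = [(unit, sum(significance[w] for w in words)) for unit, words in hits.items()]
    return sorted(scored, key=lambda pair: (-pair[1], -pair[0]))


def probability_percentage(ranked, review_number=10):
    """Share of the total relative probability held by the top units, times 100."""
    total = sum(score for _, score in ranked)
    if total == 0:
        return 0.0
    selected = sum(score for _, score in ranked[:review_number])
    return 100.0 * selected / total


hits = {1: {"grandma", "miles"}, 2: {"miles"}, 3: {"miles"}}
significance = {"grandma": 2, "miles": 1}
ranked = rank_units(hits, significance)
print(ranked)
print(probability_percentage(ranked, review_number=1))
```

With a steep, exponential-like distribution a small review number already captures most of the total, whereas a flat distribution yields a low percentage for the same review number.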

Finally, the software routine ends at a block 131 whenever the attorney discontinues a search via the cancel button in the button control panel 69.

Referring to FIG. 8, a software flow diagram is provided which illustrates the operation of the software routines used by the terminal 15 to carry out a search using the boolean front-end. The process begins at a block 125 upon selection of the boolean front-end from the pull-down search menu 65. The text which has previously been selected for the search from anywhere on the display of the screen 51 (if any) is copied and placed into the boolean search window 81 for editing at a block 127. As previously described, copying, pasting and other editing functions described above in relation to FIGS. 4 a and 4 b are provided at the block 136 via the edit button of the button control panel 83 to permit the attorney to better formulate a search. Also via the edit button, the attorney has access to the thesaurus, conjugate and spell-checker routines to aid search word selection.

During the formulation stage, if the attorney selects the strip button of the control panel 83, the terminal 15 branches to a routine for stripping out excess insignificant search words. Among other benefits which follow, this feature permits the attorney to rapidly select all desirable words being displayed which are surrounded by insignificant contextual words without having to carefully select or retype and possibly misspell the desired search words. The attorney merely selects a group of text containing the desired search words being displayed, presses the search control 63, selects “boolean” from the search menu 65, and presses the strip button twice. The terminal 15 responds by stripping all insignificant words from the selected group of text, leaving only significant words to be further manipulated for the boolean search.

Instead of rapidly selecting the strip button of the control panel 83, however, the attorney may perform the process in two stages. After the first selection of the strip button as identified via the block 129 and a block 131, the terminal 15 responds at blocks 133, 135, 137, 139 and 140, in the identical way described in relation to the blocks 103, 105, 107, 109 and 110 of FIG. 7, to classify the significance of the selected words and to identify those words that do not exist in the database to be searched. Thereafter, highlighting and italic emphasis are placed on the selected words in the search window 81. As previously described, the attorney may change the highlighting of the words to change their significance, to provide thesaurus or conjugate word alternates, etc., or to help identify words that will provide successful boolean “hits”. In addition, the terminal 15 also provides the attorney with a count via the edit button of the number of times a given word (or boolean grouping of words) occurs in the database to be searched, enhancing the attorney's ability to analyze a search.

Via the decision blocks 129 and 131, if the attorney presses the strip button a second time, branching to a block 141 directs the attorney terminal 15 to delete all insignificant (unhighlighted) search words. Returning to the block 127, the terminal 15 thereafter updates the display by showing only the search words classified as significant.
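The two-stage behavior of the strip button might be sketched as follows: a first selection marks the significant words, and a second selection deletes the rest. The threshold value, the sample significance numbers, and the function names are assumptions made for illustration only.

```python
def highlight_significant(words, significance, threshold=0.3):
    """First press: flag each word as significant (True) or not (False)."""
    return [significance.get(word, 0.0) >= threshold for word in words]


def strip_insignificant(words, flags):
    """Second press: keep only the words flagged as significant."""
    return [word for word, keep in zip(words, flags) if keep]


# Hypothetical work-space contents and significance numbers.
workspace = ["miles", "to", "your", "grandma"]
significance = {"miles": 0.6, "to": 0.05, "your": 0.1, "grandma": 0.9}

flags = highlight_significant(workspace, significance)
print(flags)
print(strip_insignificant(workspace, flags))
```

Keeping the two stages separate leaves room between presses for the attorney to change a word's highlighting before anything is deleted.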

By placing the cursor between two search words or search word groups displayed in the search window 81 and then selecting the operators button from the button control panel 83, the attorney can easily choose and insert the available boolean operators without having to type. Specifically, selecting the operators button results in providing a pull-down series of all boolean operators from which the attorney can freely choose. Selecting one of the operators causes the terminal 15 to immediately insert the operator at the cursor position in the search window 81.

The terminal 15 also analyzes the sequence of the search words and operators to determine if parentheses might be needed. If so, the terminal 15 directs the attorney through a variety of matched parenthetical associations to identify the one most appropriate for the current search. For example, if the search involves the phrase “grandma & house or home”, many attorneys would not be able to determine that the search could be interpreted two different ways. The search could be carried out by looking for all database units containing the words “grandma” and “house”, and all database units having the word “home” alone. Alternately, the search might identify only those database units having the word “grandma” and either “house” or “home”. To clarify the attorney's intent, parentheses can be used; however, manually matching the numbers of parentheses and placing them properly often cause unnecessary delays. To avoid these difficulties, the attorney need only respond affirmatively to an automatic parenthesis placement prompt from the attorney terminal 15. In response, the attorney terminal 15 provides all possible variations of parenthesis placement, allowing the attorney to toggle therebetween to locate and select the proper form. Such prompting only occurs where multiple ways of interpreting the same search are possible.
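Generating the variations of parenthesis placement might be sketched with a short recursive enumeration. The sketch assumes the search is an alternating sequence of search words and two-operand operators, and that ambiguity arises only when different operators are mixed, since a chain of one repeated operator reads the same under any grouping. The function names are hypothetical.

```python
def groupings(operands, operators):
    """Return every fully parenthesized reading of the search, as strings."""
    if len(operands) == 1:
        return [operands[0]]
    results = []
    for i, op in enumerate(operators):
        # Treat operator i as the outermost one and group each side recursively.
        for left in groupings(operands[: i + 1], operators[:i]):
            for right in groupings(operands[i + 1 :], operators[i + 1 :]):
                results.append(f"({left} {op} {right})")
    return results


def needs_prompt(operators):
    """Prompt only when mixed operators allow more than one interpretation."""
    return len(set(operators)) > 1


words = ["grandma", "house", "home"]
ops = ["&", "or"]
if needs_prompt(ops):
    for variant in groupings(words, ops):
        print(variant)
```

For the phrase in the example this lists the two readings discussed above, between which the attorney could toggle.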

As operators, additional words and parentheses are added to the search, the attorney is automatically updated as to the number of database units that meet the currently displayed boolean search. In fact, the boolean search window 81 provides counters 82 which indicate the current number of database units and hits based on the currently displayed search or based on a selected sub-portion thereof. As the displayed search changes, the terminal 15 automatically updates the displayed number of hits. Based on the hit number, the attorney may choose to select additional search words and/or operators or simplify the current search to obtain a reviewable number of hits.
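Updating the counters as the search changes can be inexpensive if each word's database units are already available from the indexing structure, since the operators then reduce to set operations. The following sketch assumes a nested-tuple form for a parenthesized search and treats “&” as intersection and “or” as union; the representation and the sample index contents are hypothetical.

```python
def matching_units(expression, index):
    """Evaluate a boolean search against a word -> set-of-units index.

    expression is either a search word or a tuple (operator, left, right),
    where operator is "&" or "or".
    """
    if isinstance(expression, str):
        return set(index.get(expression, ()))
    operator, left, right = expression
    left_units = matching_units(left, index)
    right_units = matching_units(right, index)
    if operator == "&":
        return left_units & right_units
    return left_units | right_units


index = {"grandma": {1, 2}, "house": {2, 3}, "home": {4}}

first = ("or", ("&", "grandma", "house"), "home")   # (grandma & house) or home
second = ("&", "grandma", ("or", "house", "home"))  # grandma & (house or home)
print(len(matching_units(first, index)), len(matching_units(second, index)))
```

The two groupings of the same words return different counts, which is the kind of immediate feedback the counters are described as giving.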

Additionally, the attorney may at any time choose to switch between searching front-ends or switch between or add different search databases via the search pull-down menu 65. By switching between front-ends, the terminal 15 merely moves the current search words into the alternate search window and searching may continue from the same point.

Once a search has been formulated, the attorney selects the search up or search down buttons from the button control panel 83 as needed to sequentially access and display the database unit hits. The terminal 15 carries this process out via blocks 143 and 145. Upon selecting the cancel button from the control panel 83, the terminal 15 ends the boolean search at a block 149.

FIG. 9 is a diagram representing the association of data fields into two data records which are the basic building blocks for the overall transcription data structure. In particular, the CAT system 13 utilizes a linked-list arrangement of two types of data records: a key-stroke code listing (KCL) record 151 and a corresponding text (CT) record 153. Although other types of records are contemplated, these two types of records provide the preferred storage structure for the court reporter's cross-referencing library.

Basically, the CAT system 13 uses records 151 and 153 to associate each individual key-stroke code with as many subsequent key-stroke codes as proves necessary to reconstruct spoken words. Particularly, the KCL record 151 associates: 1) a listed key-stroke code (LKC) field 155 for storing a specific key-stroke code; 2) a reporter listing counter field 156 for storing a value indicative of the number of times that the CAT system 13 uses the record; 3) a current listing counter field 157 for storing a value indicative of the number of times that the CAT system 13 uses the record in the current case; 4) a common listing counter 158 for storing a value indicative of the number of times that any CAT system, including the CAT system 13, used the record; 5) a first KCL_record pointer field 159 for storing a pointer to the next KCL record on this level; 6) a CT record pointer field 161 for storing a pointer to an associated CT record; and 7) a second KCL_record pointer field 163 for storing a pointer to a corresponding KCL_record at the next listing level down.

Similarly, the CT record 153 associates: 1) a CT string field 165 for storing a string of text; 2) a reporter listing counter field 166 for storing a value indicative of the number of times that the CAT system 13 uses the current string; 3) a current listing counter field 167 for storing a value indicative of the number of times that the CAT system 13 uses the string in the current case; 4) a common listing counter 168 for storing a value indicative of the number of times that any CAT system, including the CAT system 13, used the current string; 5) a CT record homonym pointer field 169 for storing a pointer to another CT record containing a homonym to the contents of the CT string field 165; and 6) a grammatical word type field 170 for storing an indicator of the grammatical type(s) of the word in the CT string field 165. Grammatical types not only include the standard noun, verb, adverb, etc., but also include an additional category “legal” for legal terms.
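The two record types of FIG. 9 might be modeled as in the following sketch, with each field annotated by its reference numeral. The Python attribute names, the use of object references in place of memory pointers, and the sample key-stroke code are assumptions made for illustration; the disclosure does not prescribe a programming language or field encoding.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class CTRecord:
    ct_string: str                       # CT string field 165
    reporter_count: int = 0              # reporter listing counter field 166
    current_count: int = 0               # current listing counter field 167
    common_count: int = 0                # common listing counter 168
    homonym: Optional[CTRecord] = None   # CT record homonym pointer field 169
    word_types: Tuple[str, ...] = ()     # grammatical word type field 170


@dataclass
class KCLRecord:
    key_stroke_code: str                       # listed key-stroke code field 155
    reporter_count: int = 0                    # reporter listing counter field 156
    current_count: int = 0                     # current listing counter field 157
    common_count: int = 0                      # common listing counter 158
    next_on_level: Optional[KCLRecord] = None  # first KCL_record pointer field 159
    text: Optional[CTRecord] = None            # CT record pointer field 161
    next_level: Optional[KCLRecord] = None     # second KCL_record pointer field 163


# A hypothetical key-stroke code whose text has one homonym.
too = CTRecord("too")
to = CTRecord("to", homonym=too)
record = KCLRecord("TO", text=to)
print(record.text.ct_string, record.text.homonym.ct_string)
```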

FIG. 10 is a detailed diagram representing the overall data structure of the cross-reference library used by the CAT system 13 to transcribe the key-stroke codes received from the stenographic recorder. KCL records 200, 201 and all KCL records (not shown) directly to the right and left of records 200, 201 constitute a first listing level. This first listing level is a linked-list of each beginning key-stroke code of the words held in the cross-reference library 15. The KCL records are “linked” using the first KCL record pointer field 159, i.e., each pointer field contains the address in memory where the next KCL record resides.

All words which can be represented by a single key-stroke can be located using a single KCL record at this first level. Words requiring multiple key-strokes must identify the first key-stroke of the word in one of the KCL records at the first listing level, and that identified KCL record should then point via field 163 to a second listing level. For example, the KCL record 200 points to a second listing level comprised of KCL records 202, 203, etc. Similarly, a third listing level exists below the KCL record 203 beginning with a KCL record 204, and so on as necessary to reach multiple key-stroke words. Additionally, the first or subsequent listing levels might be accessed using hashing code indexing for increased speed in access time.

To directly identify exact text replacement using the cross-reference library, the CAT system 13 would first need to know the number of key-strokes required to represent every given word. Because this does not occur, the CAT system 13 must use a searching strategy to identify these numbers.

Because most words can be represented by a single key-stroke, the CAT system 13 initially treats all words as a single key-stroke word. Only after detecting transcription problems with subsequent key-strokes will the CAT system 13 back-track and consider whether the key-stroke might be the first of a multiple key-stroked word. In particular, using the identified KCL records constituting a second listing level, the CAT system 13 must locate a single KCL record containing the second key-stroke in the multiple key-stroke series. The identified KCL record at this second level will point to a subsequent level for providing a subsequent key-stroke in the multiple key-stroke series. This process continues until the last key-stroke is identified.

In addition, each of the KCL records at any listing level may or may not point via the field 161 to associated text. If a single word corresponds to a single key-stroke, the identified KCL record in the first listing level will point to a CT record which contains the text of that word. Similarly, a KCL record at the second level identified for a word represented by two key-strokes will point to a CT record containing the actual text of that word. In this way, any key-stroke or series of key-strokes which represent a word can be transcribed if the cross-reference library contains the path to the word formed by the key-stroke(s) of that word, i.e., if the cross-reference library contains the text counterpart.

More particularly, upon receiving the first key-stroke code of a sentence, the code is compared with each key-stroke code stored in each KCL record on the first listing level. For example, if the received code does not match the stored code in the listed key-stroke code field 155 of the KCL record 200, the CAT system 13 uses the contents of the field 159 of KCL record 200 to access the next KCL record, the record 201, for a similar comparison to the code stored therein. In this manner, by stepping through the first listing level, a matching KCL record can be found.

Assuming that the code stored in KCL record 200 does match the first key-stroke code received, the CAT system 13 accesses the associated CT record 205 to retrieve readable cross-referenced text. Additionally in this example, the CT record 205 provides the CAT system 13 with a pointer to a homonym stored in a CT record 207. The text located in CT records 205 and 207 possibly provides the desired transcription, but only by transcribing the entire sentence can the CAT system 13 be sure. Oftentimes, the CAT system 13 discards such text in favor of multiple key-stroke text. Particularly, the CAT system 13 uses the KCL record 200 as a back-tracking point. If, in transcribing the sentence, the KCL record 200 proves to be only the first of two key-strokes, the CAT system 13 uses the pointer of the KCL record 200 to access a second listing level. This second listing level is specifically associated with the KCL record 200 and begins with KCL records 202 and 203 followed by all KCL type records (not shown) to the right of record 203. Any second code received which follows a first code which matches that stored in the KCL record 200 is compared to the codes stored in the KCL records on the second listing level. The KCL record 204 represents yet a third listing level under the key-stroke sequence stored in the records 200 and 203, and so on. CT records may or may not be associated with a given KCL record, depending on whether a corresponding word exists for the represented key-stroke code sequence. The KCL record 202 exemplifies such a situation.

Only a single CT record is generally associated with a single KCL record, such as is shown with KCL record 203 and a CT record 209. Only when homonyms exist will there be multiple CT record association, as illustrated with the KCL record 200 and the CT records 205 and 207. Multiple CT record associations, however, are indirect in that each KCL record can only identify, i.e., point to, a single CT record. Additional CT record “homonyms” are pointed to by the identified CT record.
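The listing levels and CT record associations described for FIG. 10 can be sketched as follows. This is a minimal illustration only, assuming Python dataclasses in place of raw memory addresses; the class names, the `find_on_level`, `texts_for` and `lookup` helpers, and all key-stroke codes and texts in the example data are hypothetical and are not taken from the figure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CTRecord:
    text: str                               # readable cross-referenced text
    homonym: Optional["CTRecord"] = None    # next homonym CT record, if any


@dataclass
class KCLRecord:
    code: str                                   # listed key-stroke code (field 155)
    next_record: Optional["KCLRecord"] = None   # next record on this level (field 159)
    ct: Optional[CTRecord] = None               # associated text (field 161)
    next_level: Optional["KCLRecord"] = None    # head of the next listing level (field 163)


def find_on_level(head, code):
    """Step through one listing level until a record with a matching code is found."""
    record = head
    while record is not None:
        if record.code == code:
            return record
        record = record.next_record
    return None


def texts_for(record):
    """Collect the text of the pointed-to CT record and any chained homonyms."""
    texts, ct = [], record.ct
    while ct is not None:
        texts.append(ct.text)
        ct = ct.homonym
    return texts


def lookup(first_level, codes):
    """Follow a key-stroke sequence down through successive listing levels."""
    level, record = first_level, None
    for code in codes:
        record = find_on_level(level, code)
        if record is None:
            return []
        level = record.next_level
    return texts_for(record) if record is not None else []


# Hypothetical example data loosely mirroring records 200, 201, 203 and 204.
r204 = KCLRecord("WARD", ct=CTRecord("thereafterward"))
r203 = KCLRecord("AF", ct=CTRecord("thereafter"), next_level=r204)
r201 = KCLRecord("KAT", ct=CTRecord("cat"))
r200 = KCLRecord("THR", next_record=r201,
                 ct=CTRecord("their", homonym=CTRecord("there")),
                 next_level=r203)

print(lookup(r200, ["THR"]))         # single key-stroke word with a homonym
print(lookup(r200, ["THR", "AF"]))   # two key-stroke word
```

Each KCL record points to at most one CT record, so the homonym chain is reached only through that first CT record, as the text describes.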

Upon receiving a first key-stroke code of a sentence from stenographic recorder 11, the CAT system 13 begins a transcription expedition by parsing through a first listing level of the cross-reference library in an attempt to find a matching KCL record. If a matching KCL record is found which has an associated CT record transcription, the CAT system 13 records the match and treats the second (next) code received as the beginning of a new word by parsing the first listing level.

If a matching KCL record is found for the first code received which has no associated CT record, the CAT system 13 treats the second key-stroke code received as the second part of the word by branching to the second listing level pointed to by the matching KCL record (on the first listing level). Note that if properly constructed, there should never be any KCL record which has neither a pointer in field 161 to an associated CT record nor a pointer in field 163 to a subsequent level of KCL records. If a match is found at the second listing level with an associated CT record transcription, the CAT system 13 treats the third key-stroke code received as the beginning of a new word by parsing the first listing level, repeating the cycle.

If, after transcribing a series of key-strokes in a sentence, the CAT system 13 encounters a dead end, i.e., an associated CT record cannot be identified, back-tracking must occur. The CAT system 13 returns to the last matching KCL record of the previously transcribed word, and continues the transcription process through subsequent listing levels to see if what had been considered an entire word is really only a portion thereof. If a match is found with an associated CT record transcription, the CT record at that subsequent (deeper) listing level is stored, and the following key-stroke code received is treated as the beginning of a new word, repeating the cycle.

With each successive, unsuccessful parsing round, the previously described transcription process becomes more and more complex with potentially many parallel and nested transcription pathways being considered. If available, the first completely transcribed sentence found is communicated to attorney terminals 15 and 16. Otherwise, the sentence formulation with the greatest number of key-strokes transcribed will be prepared for communication.
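The back-tracking strategy described above can be sketched as a depth-first search. This is an illustrative simplification, not the disclosed implementation: a Python dictionary keyed by tuples of key-stroke codes stands in for the linked listing levels, and the function name `transcribe`, the `max_strokes` limit, and the sample codes and words are all assumptions.

```python
def transcribe(codes, library, max_strokes=4):
    """Return (words, complete) for a sentence of key-stroke codes.

    Each word is first treated as a single key-stroke word; on a dead end
    the search back-tracks and retries the previous word with more strokes.
    If no complete transcription exists, the partial result covering the
    greatest number of key-strokes is returned with complete=False.
    """
    best = ([], 0)  # (words, number of key-strokes transcribed)

    def search(pos, words):
        nonlocal best
        if pos > best[1]:
            best = (list(words), pos)
        if pos == len(codes):
            return True  # first completely transcribed sentence found
        for length in range(1, max_strokes + 1):
            key = tuple(codes[pos:pos + length])
            if len(key) < length:
                break  # ran past the end of the sentence
            text = library.get(key)
            if text is None:
                continue
            words.append(text)
            if search(pos + length, words):
                return True
            words.pop()  # dead end: back-track and try a longer word
        return False

    complete = search(0, [])
    return best[0], complete


# Hypothetical library: "KAT" alone is a word, and also begins a longer word.
library = {("KAT",): "cat", ("KAT", "HROG"): "catalog", ("S",): "is"}

print(transcribe(["KAT", "HROG", "S"], library))
print(transcribe(["KAT", "ZZZ"], library))
```

In the first call the single-stroke reading "cat" leads to a dead end at the second code, so the search returns to the first word and extends it to two strokes.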

Additionally, the CAT system 13 not only adds the transcribed words to the transcription database 33, but also maintains the corresponding database indexing structure by adding the location of each instance of the word thereto. This process is set forth in greater detail with regard to FIG. 12 below.

FIG. 11 is a detailed diagram representing the association of data fields into the types of data records which are the basic building blocks of the database indexing structure illustrated in FIG. 12 associated with each database of the present invention. Through the use of the data records illustrated, the indexing structure provides the attorney terminals with virtually instant feedback regarding a variety of information such as the location of any word in the associated database to be searched. Although preferably located and managed at the same location as the associated database, the indexing structure might also be located locally for quick access.

In particular, the back-bone of the indexing structure involves word records, such as a word record 173, for associating a variety of information with a given word found in the associated database. Each word record used is assigned to a specific word, the text of which is stored in a text field 177.

A current_listing_counter field 179 stores a count representative of the number of database units of the associated database in which the word stored in the text field 177 can be found. A grammatical_word_type field 181 stores an indicator of the grammatical word type(s) of the word stored in the text field 177. Grammatical types include nouns, verbs, etc., as well as the “legal” type (described previously). Although not specifically shown, there are a variety of legal types which carry specific information which can be used to locate the exact library in the case law database to be searched. For example, the term “duress” is associated with a grammatical legal type which indicates that state law libraries should be searched. Together with setup files indicating that the choice of law state is, for example, Illinois, the attorney terminals can immediately select the appropriate state law database libraries for searching. Many words have multiple grammatical types. To accommodate them, the field 181 provides storage of an indicator which not only provides multiple type indications, but also provides information regarding the relative frequency of usage of each possible type.

Additionally, a significance number field 184 stores the significance number which provides the searching front-ends of the present invention with an automatic indication of the significance of the word stored in the text field 177. Further detail illustrating exemplary calculations of such significance numbers is provided below in relation to FIG. 13.

For access to alternate conjugate verb forms, a conjugate_text_pointer field 185 is provided. Similarly, a thesaurus_text_pointer field 187 provides access to the text of words relating to the word stored in the text field 177 which might be found in a standard thesaurus. Both the thesaurus and conjugate pointer fields 187 and 185 point to respective circular queues of related word records.

The word records, such as the record 173, also provide a usage_pointer field which points to a linked list of all database units that contain the word stored in the text field 177. Specifically, the usage pointer field 189 points to a linked-list of usage records, such as a usage record 175, which contains: 1) a database unit type field 191 for storing the type of database unit in which the associated word can be found; 2) a database unit pointer field 193 for storing a pointer to the database unit in the actual database where the word can be found; and 3) a next_usage_record_pointer field 195 for identifying the next usage record in the linked-list which, if it exists, stores the same information regarding the next usage of the associated word in the database. The current_listing_counter 179 provides the number of usage records in the linked-list.
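The word and usage records of FIG. 11 can be sketched as two small record types. This is a minimal sketch, assuming Python dataclasses and object references in place of pointers; the attribute names follow the field names in the text where those are given, while the `usages` helper and the example values (unit numbers and types) are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class UsageRecord:
    unit_type: str                                # database unit type (field 191)
    unit_pointer: int                             # database unit pointer (field 193)
    next_usage: Optional["UsageRecord"] = None    # next usage record (field 195)


@dataclass
class WordRecord:
    text: str                                     # text field 177
    current_listing_counter: int = 0              # field 179
    grammatical_word_type: str = ""               # field 181
    significance_number: int = 0                  # field 184
    conjugate: Optional["WordRecord"] = None      # conjugate_text_pointer (field 185)
    thesaurus: Optional["WordRecord"] = None      # thesaurus_text_pointer (field 187)
    usage: Optional[UsageRecord] = None           # usage pointer (field 189)


def usages(word: WordRecord) -> Iterator[UsageRecord]:
    """Walk the linked-list of usage records for one word."""
    node = word.usage
    while node is not None:
        yield node
        node = node.next_usage


# Hypothetical example: a word found in question unit 1 and answer unit 3.
second = UsageRecord("A", 3)
first = UsageRecord("Q", 1, next_usage=second)
duress = WordRecord("duress", current_listing_counter=2,
                    grammatical_word_type="legal",
                    significance_number=90, usage=first)

locations = [(u.unit_type, u.unit_pointer) for u in usages(duress)]
print(locations)
```

The counter is kept equal to the length of the usage list, so a terminal can read the number of usages without walking the list.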

All new databases, i.e., those containing no words, utilize the same database indexing structure having identical word record entries and layout. Differences only appear as unique sequences are added to the database requiring unique association of the indexing structure with usage records. Therefore, the entire new (clean) database indexing structure can be copied and readily applied to new databases as needed. Similarly, a clean database indexing structure can be easily extracted from a current structure in use. Therefore, each database indexing structure is associated with a specific release number which indicates to the attorney the level of completeness that a current version may or may not have.

FIG. 12 is a diagram illustrating the interconnection of the word and usage records of the database indexing structure used by the present invention by both the boolean and natural language searching front-ends. To identify the location of a specific word in a given database or to determine whether the database even contains the word, instead of requiring a complete sequential search through the database, the database indexing structure of the present invention provides such information immediately via indexing without requiring any textual searching.

Specifically, an indexing system such as that illustrated is associated with each database to be searched. When, through database selection described previously, a database is selected for searching, the attorney terminal, such as the terminal 15, first gains access to the indexing structure illustrated in FIG. 12 for the selected database. The indexing structure for the current transcript is maintained by the CAT system 13 in the transcription database 33. Similarly, the storage and maintenance of the structure may be handled by each attorney terminal or by any other computer at a remote location.

Once access to the indexing structure of the desired search database has been established, the attorney terminal, such as terminal 15, can easily identify whether a specific search word formulated by the attorney using the boolean or natural language searching front-ends exists in the database, and, if so, how many times and at what specific locations. To accomplish this, the terminal 15 merely converts the text of search word 243 to a hash code using a typical hashing algorithm 249. The terminal 15 accesses the specific word record corresponding to the hashed search word via hashing array 251. In particular, the terminal 15 utilizes the hash code generated as an index to a word pointer which points to the specific word record at issue. From the word pointer identified, the terminal 15 then locates the desired word record which provides access to all of the information needed to conduct a search. For example, a hash code stored at hash code index 252 of the hashing array 251 provides immediate access to a word record 253 via a word pointer stored in a field 254. Similarly, the attorney terminal might access any other word record stored in the database indexing structure.

Once a specific word record is located, the attorney terminal 15 has immediate access to all of the fields stored therein. In particular, the significance number field 184 provides the significance number of the search word for boolean or natural language front-end searching. Via the current_listing_counter field 179, the attorney terminal receives an immediate indication as to the number of times, if any, that the search word exists in the database. As described previously, italics are added to the display of those search words having no usage in the database. Moreover, through the usage_pointer field 189, the word record provides the attorney terminal with the location of a search word in the associated database. For example, the usage_pointer field 189 of the word record 253 provides direct access to a linked-list of usage records 267, 269, 271, etc. Each of these usage records identifies the type and location of a single database unit which contains the search word. Another two fields might also be added to provide the exact position of the word in the identified database unit, although not shown. Doing so provides for complete reconstruction of the textual database from the database indexing structure alone, wherein the second field is used as a pointer to the next word usage record in sequential order.

Using the database indexing structure, the attorney terminal 15 can then perform all boolean and natural language functions without actually searching the database. The database indexing structure provides rapid access to all of the information needed to aid the attorney in formulating a search without having to perform any textual scanning type searching.

Also without actually searching the database, the indexing structure provides lists of available thesaurus and conjugate “alternate” words which exist in the database to be searched. For example, a circular queue of thesaurus type alternate words is provided via the thesaurus text pointers 187 of the word record 253 and word records 261, 263 and 265. Similarly, the conjugate alternate words are provided via the conjugate text pointers 185 of the word record 253 and word records 257 and 259. Although three total words exist in the exemplary conjugate word circular queue, the queue may include no alternates or as many conjugate forms as exist. Similarly, more or fewer thesaurus type alternate words may also be included in the thesaurus circular queue. In addition, each word stored in the word records of any such circular queue can provide access to all of the others. For example, if instead of selecting the search word stored in the word record 253, the attorney chooses an alternate search word stored in the word record 257, the selection of the conjugate button of the search window (described below) provides access to the alternate words stored in the word records 259 and 253 by merely stepping through the circular queue. Each alternate word is presented to the attorney with italics and highlighting where required.
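Stepping through a circular queue of alternates can be sketched as follows. This is an illustration only: the `Word` class, the `link_circular` and `alternates` helpers, the sample verb forms, and the assumed queue order are hypothetical, and a single `conjugate` reference stands in for the conjugate text pointer.

```python
class Word:
    def __init__(self, text, count=0):
        self.text = text
        self.count = count       # stands in for the current listing counter
        self.conjugate = self    # a queue of one simply points back to itself


def link_circular(words):
    """Link the given word records into one circular queue."""
    for i, word in enumerate(words):
        word.conjugate = words[(i + 1) % len(words)]


def alternates(start):
    """Step through the queue from start, stopping on return to start."""
    found, node = [], start.conjugate
    while node is not start:
        found.append(node)
        node = node.conjugate
    return found


# Hypothetical queue of three conjugate forms; "running" has no usage.
run, ran, running = Word("run", 4), Word("ran", 2), Word("running", 0)
link_circular([run, ran, running])

# Any member reaches all the others; unused words are flagged for italics.
shown = [(w.text, w.count == 0) for w in alternates(ran)]
print(shown)
```

Because the queue is circular, the same traversal works from any starting word, which matches the point that each word in a queue provides access to all of the others.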

In a preferred embodiment where storage space is not an issue, each database indexing structure provides indexing to nearly all of the possible words used in a given language with associated preset circular queues for conjugate and thesaurus word linkages and significance and grammatical information. This same database indexing structure is used as a default structure for all databases. Only as word usage records are added will the database indexing structure become unique to a corresponding textual database. As each word is added to a preset database structure, the corresponding word record information is updated and new usage records are added in a first out fashion onto the linked-list of usage records via the pointer field 189 of the word record. When performing the front-end searching functions involving the identification of a given search word which turns out to have never been used, instead of finding a dead-end because of a missing word record, the word record would be located so that potential thesaurus and conjugate word alternatives which have been used could be identified for adoption. Additionally, for every occurrence of every word found in the database, no matter how common the word, a separate usage record is provided to identify each such usage instance.

If storage space is of concern, only those word records which have been used are added as they are needed to the database index structure of the database at issue. However, doing so will tend to minimize the functionality of the thesaurus and conjugation buttons. An alternate storage space saving approach would be to disregard all extremely common words such as “the”, “a”, etc., by not storing any usage records for these words at all. Instead, the significance number of the associated word record storing the extremely common word would indicate to the attorney terminal that the actual positions are not available. To determine whether to save such usage records, the significance number could be compared to a third threshold level set low enough to strip out only the most insignificant of all possible words. The third threshold could be adjusted to pare down the size of the database index structure.

Another way of paring down the storage size of the database indexing structure is to only allow a single usage record to be added for any word record for any one database unit. In other words, no matter how many times the word “the” occurs in a single database unit, only one usage record is permitted to be added to the word record corresponding to the word “the”.
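The two space-saving options just described can be sketched in a single helper. This is a rough sketch under stated assumptions: the function name `record_usage`, the default third threshold of 20, and the use of a Python list of unit identifiers in place of a linked-list of usage records are all hypothetical; the text does not give a value for the third threshold.

```python
def record_usage(usage, significance, unit,
                 third_threshold=20, one_per_unit=True):
    """Add a usage entry unless a space-saving rule suppresses it.

    usage            list of database units already recorded for the word
    significance     significance number of the word (0 to 100)
    unit             identifier of the database unit containing the word
    third_threshold  assumed cut-off below which no usage is stored
    one_per_unit     if True, keep at most one entry per database unit
    """
    if significance < third_threshold:
        return False  # extremely common word: positions are not stored
    if one_per_unit and unit in usage:
        return False  # this unit is already listed for the word
    usage.append(unit)
    return True


# Hypothetical example: "the" (15) is stripped, "duress" (90) is kept once.
the_usage, duress_usage = [], []
for unit in ["Q1", "Q1", "A1"]:
    record_usage(the_usage, 15, unit)
    record_usage(duress_usage, 90, unit)

print(the_usage, duress_usage)
```

Raising the assumed threshold strips more words and shrinks the index further, at the cost of losing position information for those words.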

As mentioned previously, by adding additional fields to the usage record, a sequential linkage of all words in the textual database can provide for easy reconstruction of the textual database counterpart. In fact, the counterpart itself may never be needed. Similarly, instead of extracting text from a remote location, the hashing codes might instead be transmitted. As long as a copy of the same preset database indexing structure exists on the sending and receiving end, the actual text can easily be reconstructed for display. In addition, because of the inherent compression occurring with the hash code length versus the text length, the speed of data exchange can be increased dramatically. Similarly, the size of the files will decrease. Estimates indicate that at least a three to one (3:1) compression factor can be easily achieved.

In one embodiment of the present invention, as the preset database indexing structure is personalized, i.e., words are added thereto, a sequential hash code file 275 is created from which transmissions may originate. The sequential file 275 consists of a database unit number table 277 which provides access to a series of associated hash code sequences 279 for sequentially storing the hashing code for each word used in the database. Specifically, to create, for example, a new transcript file, the CAT system 13 (or any attorney terminal) provides an indication of a first database unit, a question #1 (Q1), at the block 245. Thereafter, each word transcribed for Q1 is sequentially provided via a block 243 to a typical hashing algorithm at a block 249. A hash code for each word is thus generated.

The database unit number, Q1, is added to the database unit table which assigns a pointer to an individual hashing sequence of the sequences 279 which begins to sequentially store each hash code generated therein. Whenever the database unit changes as indicated via the block 245, for example to the first answer (A1), a new entry in the table 277 is made which provides access to a new storage space for the next series of hash codes via another hashing sequence in the sequences 279.

At the same time, each hash code generated and database unit indicator is used to add a usage record to the appropriate word records. In particular, the hash code generated is indexed into the hashing array 251 which provides a pointer to the specific word record of the current word. Once the word record is located, the current_listing_counter 179 therein is incremented to indicate the new usage. Also, a new usage record is created by storing both the database unit number via the indicator provided at the block 245 in the pointer field 193, and the database unit type also provided via the block 245 in the type field 191. The usage record is thereafter added to the linked-list of usage records associated with that word record.
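The indexing pass described above, and the later look-up that avoids any text scanning, can be sketched together. This is a simplified sketch with several assumptions: `zlib.crc32` stands in for the unspecified "typical hashing algorithm", hash collisions are ignored, a Python dictionary stands in for the hashing array 251, a list stands in for the linked-list of usage records, and the sample questions and answers are invented.

```python
import zlib
from dataclasses import dataclass, field


def hash_code(word: str) -> int:
    # Assumed stand-in for the hashing algorithm; collisions are ignored.
    return zlib.crc32(word.lower().encode("utf-8"))


@dataclass
class WordRecord:
    text: str
    current_listing_counter: int = 0
    usage: list = field(default_factory=list)  # (unit type, unit number) pairs


def add_usage(hashing_array, word, unit_type, unit_number):
    """Index one transcribed word into its word record."""
    record = hashing_array[hash_code(word)]    # assumes a preset word record
    record.current_listing_counter += 1
    record.usage.append((unit_type, unit_number))
    return record


def locate(hashing_array, word):
    """Return (count, locations) for a search word without scanning text."""
    record = hashing_array.get(hash_code(word))
    if record is None:
        return 0, []
    return record.current_listing_counter, list(record.usage)


# Hypothetical preset structure and a two-unit transcript.
hashing_array = {hash_code(w): WordRecord(w)
                 for w in ["where", "were", "you", "know"]}
transcript = [("Q", 1, "where were you"), ("A", 1, "you know where")]
for unit_type, unit_number, text in transcript:
    for word in text.split():
        add_usage(hashing_array, word, unit_type, unit_number)

print(locate(hashing_array, "where"))
```

A search word is answered from the word record alone: the counter gives the number of usages and the usage list gives the units, so the transcript text is never read during the look-up.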

To accommodate words not found in the hashing array 251: 1) the hashing code for the new word is added as an entry in the hashing array 251; 2) a new word record is created for the new word and the pointer thereto is placed in the word pointer field associated with the new hash code of the hashing array 251; 3) the text of the new word is inserted into the text field 177 of the new word record; and 4) the significance number stored in the field 184 of the new word is set to the maximum significance level because of its uniqueness. Additionally, instead of saving the new hashing code alone in the sequential file 275, an escape sequence character is placed in the hashing sequence 279 followed by the actual text of the new word and a closing escape sequence character. In this way, the text of new words is directly included in the hash code sequence of the sequential file 275 so that any recipient can reconstruct the full text using only a clean, preset database indexing structure along with sequential file 275. Also, the sequential file 275, or any portion thereof via database unit table 277 look-up, can be rapidly transmitted into the environment of a second database indexing structure for reconstruction of the word text in whole or in part. Inherent compression also adds to these benefits. Moreover, by retrieving as many sequential files, such as the file 275, as possible, the new words encountered can be used as a basis for building an even more complete preset database structure that can then be redistributed, providing better indexing coverage.
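The sequential hash code file and its escape handling for new words can be sketched as an encode and decode pair. This is a minimal sketch, not the disclosed file layout: the `ESC` marker value, the use of `zlib.crc32` as the hashing algorithm, the dictionary used as the preset structure, and the representation of the database unit table as a Python dictionary keyed by unit name are all assumptions.

```python
import zlib

ESC = "\x1b"  # hypothetical escape sequence character


def hash_code(word):
    # Assumed stand-in for the hashing algorithm; collisions are ignored.
    return zlib.crc32(word.encode("utf-8"))


def encode(words, preset):
    """Store a hash code per known word; escape the text of new words."""
    sequence = []
    for word in words:
        code = hash_code(word)
        if code in preset:
            sequence.append(code)
        else:
            sequence.extend([ESC, word, ESC])  # opening marker, text, closing marker
    return sequence


def decode(sequence, preset):
    """Rebuild the text using only the preset structure and the sequence."""
    words, i = [], 0
    while i < len(sequence):
        if sequence[i] == ESC:
            words.append(sequence[i + 1])  # literal text of a new word
            i += 3                         # skip text and closing marker
        else:
            words.append(preset[sequence[i]])
            i += 1
    return words


# Hypothetical preset structure shared by sender and recipient.
preset = {hash_code(w): w for w in ["did", "you", "sign", "the", "contract"]}

# Database unit table: one hash code sequence per unit.
sequential_file = {
    "Q1": encode(["did", "you", "sign", "the", "indenture"], preset),
}
print(" ".join(decode(sequential_file["Q1"], preset)))
```

Because the new word travels as literal text between the markers, the recipient needs no entry for it in its own copy of the preset structure.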

FIG. 13 is a diagram representing an approach used by the present invention to construct the significance number for a given word found in the present invention. Although more complex schemes may be used which take into account actual grammatical usage in the sentence context, the amount of overhead associated therewith (in response time and CPU dedication) to obtain a “better” significance valuation may not be justified. This is not only because of the relatively small benefit added, but also because, in view of the interactive nature of the front-end tools provided by the present invention, the potential benefits have little impact on the attorney's ability to locate a desired search.

As illustrated, the significance number is generated considering both the grammatical type and commonality of a given word. Words which are very common are less likely to be desirable for identifying a specific database unit out of the many recorded in an entire database. Similarly, grammatical word types which only provide syntax support in a language, such as an article, whether common or not, provide little interest in the identification of a desired database. Therefore, when combined, the commonality and grammatical type of a word offer a very good indication as to the significance of a word for a given search.

Specifically, the significance number used in an embodiment of the present invention ranges from zero (0) to one hundred (100) and results from a combination of offset values generated from an offset table 301 and the statistical commonality of the word, as illustrated in an exemplary listing of sample offsets 303. For example, the term “duress” has grammatical word type “legal”, indicated by the type field 305 as having an offset value of fifty (50). Because the term is also considered uncommon statistically as represented in a description field 327, an offset value of forty (40) is added to provide a total significance number of ninety (90). Similarly, the article “the” is extremely common; thus, via a type field 321 and a description field 337, a significance number of fifteen (15) is generated. New words encountered which have not been grammatically typed are considered extremely uncommon and automatically given a maximum significance number of one hundred (100). As described below, this ensures that not only will the word be easily recognized as new, but also the word will receive appropriate highlighting (red) indicating the highest significance to the attorney during searching.
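The offset arithmetic can be sketched as a pair of look-up tables. This is an illustration only. The values for “legal” (50) and “uncommon” (40) come from the text; the text gives only the total of fifteen for “the”, so the split into an article offset of 10 and an extremely common offset of 5 is a placeholder assumption, as are the table names and the `significance` function.

```python
# Grammatical type offsets. Only "legal" is stated in the text.
TYPE_OFFSETS = {
    "legal": 50,
    "article": 10,           # placeholder: only the total of 15 is given
}

# Commonality offsets. Only "uncommon" is stated in the text.
COMMONALITY_OFFSETS = {
    "uncommon": 40,
    "extremely common": 5,   # placeholder: only the total of 15 is given
}

MAX_SIGNIFICANCE = 100


def significance(word_type, commonality):
    """Combine the type and commonality offsets into a 0 to 100 score."""
    if word_type is None:
        # New, untyped words automatically receive the maximum.
        return MAX_SIGNIFICANCE
    total = TYPE_OFFSETS[word_type] + COMMONALITY_OFFSETS[commonality]
    return min(MAX_SIGNIFICANCE, total)


print(significance("legal", "uncommon"))            # "duress"
print(significance("article", "extremely common"))  # "the"
print(significance(None, None))                     # a new word
```

Keeping the two offsets in separate tables means a word's score can be recomputed from its stored type and commonality without any sentence-level analysis, which is the low-overhead approach the text prefers.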

As previously described, the present invention operates using a first threshold to classify a word that has a higher significance than other words. The first, higher threshold value is set at a significance score of eighty (80) as a default. When displayed in the searching windows, search words having a significance number of eighty (80) or greater receive red highlighting. Similarly, the lower significance number range for receiving blue highlighting involves a second threshold value of sixty (60). Therefore, a word with a significance number of sixty (60) or greater but less than eighty (80) receives blue highlighting. All words having significance numbers less than sixty (60) receive no highlighting, and are classified as insignificant. Such words are not considered in any searching formulations unless manually overridden by the attorney as previously described.
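The two-threshold classification can be sketched as follows. The default thresholds of 80 and 60 come from the text; the function names, the use of `None` for no highlighting, the `overrides` parameter for the attorney's manual override, and the sample scores other than 90 and 15 are assumptions. This sketch treats a score of exactly sixty as meeting the second threshold.

```python
def highlight(score, first_threshold=80, second_threshold=60):
    """Return the highlight color for a significance number, or None."""
    if score >= first_threshold:
        return "red"
    if score >= second_threshold:
        return "blue"
    return None  # insignificant: no highlighting


def search_terms(scored_words, overrides=()):
    """Keep significant words, plus any the attorney manually overrides."""
    return [word for word, score in scored_words
            if highlight(score) is not None or word in overrides]


# Hypothetical scored words from a selected sentence.
scored = [("the", 15), ("contract", 70), ("duress", 90), ("signed", 55)]

print([(w, highlight(s)) for w, s in scored])
print(search_terms(scored))
print(search_terms(scored, overrides={"signed"}))
```

Words below the second threshold drop out of the search formulation automatically, and the override set models the attorney forcing one of them back in.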

If the conjugate or thesaurus buttons are selected, the alternate words located via the respective circular queues in the database indexing structure provide grounds for alternate significance highlighting calculations. Specifically, the significance number of the most common alternate word is used as the significance number for providing a possibly blue highlighting color instead of red for the selected word. This simple scheme works well with only two levels of highlighting but may be modified to provide for situations in which a multitude of other colors are involved. In such situations, at least blue highlighting at a minimum is displayed even if the second threshold is not met, because the mere selection of word alternates indicates that the attorney considers the words to be important in the search. In addition, the significance number might be adjusted based on the current database usage, but this is not preferred for similar reasons.

Additionally, although the features associated specifically with searching are shown only in the context of a legal proceeding, they are also contemplated to operate in other pre- or post-proceeding situations to aid the attorney's searching.

Although circular queues and linked-lists are preferred, the present invention contemplates many database structural modifications which might be made to the embodiments disclosed herein. Similarly, the flow and operation described above is merely an embodiment of the many possible ways of carrying out the specific objects of the present invention. It is obvious that the embodiments of the present invention described hereinabove are merely illustrative and that other modifications and adaptations may be made without departing from the scope of the appended claims.

1. An attorney terminal for performing database searching comprising: display means which is electronically controllable for displaying alphabetic and numeric text; means for providing the display screen with alphabetic and numeric text provided for a non-searching reason; said attorney terminal responding to the providing means by displaying the provided alphabetic and numeric text; and a searching front-end selectively responding to provided alphabetic and numeric text by performing a search. 2-4. (canceled)